An LDPCC decoding algorithm 
based on Bowman-Levin approximation 
— Comparison with BP and CCCP — 
Abstract: Belief propagation (BP) and the concave convex
procedure (CCCP) are both methods that use the Bethe
free energy as a cost function to solve information processing
tasks. We have developed a new algorithm that also uses the
Bethe free energy, but exchanges the roles of the master variables
and the slave variables. This is called the Bowman-Levin (BL)
approximation in the domain of statistical physics. When we
applied the BL algorithm to decode the Gallager ensemble of
short-length regular low-density parity check codes (LDPCC)
over an additive white Gaussian noise (AWGN) channel, its
average performance was somewhat better than that of either
BP or CCCP. This suggests that the BL algorithm may also be
successfully applied to other problems to which BP or CCCP
has already been applied.

I. Introduction 

Recently, various statistical inference algorithms have become
of interest in the field of large-scale information processing.
Belief propagation (BP) [1] and the concave convex
procedure (CCCP) [2] are among the most effective of the
methods that minimize the Bethe free energy [3], [4]. In
practical applications (e.g., the problem of decoding
low-density parity check codes (LDPCC) [5], [6]), BP and
CCCP have both been successfully applied [7].

However, they are not the only methods that minimize the
Bethe free energy. In this paper, we focus on the method
of Lagrange undetermined multipliers used by both BP and
CCCP, and derive a new algorithm by exchanging the roles
of the master variables and the slave variables. This approach,
called the Bowman-Levin (BL) approximation [8], is sometimes used in
the field of statistical physics as a way to find a stationary point (a
saddle point, local minimum, or local maximum) of the Bethe free
energy.

II. Low density parity check code (LDPCC)

The LDPCC decoding problem can be handled within
a Bayesian framework. The prior probability of the codes,
consisting of N binary bits ($x \in \{+1, -1\}^N$), is defined as

$$P(x) \propto \prod_{\mu=1}^{M} \Bigl(1 + \prod_{l \in \boldsymbol{\mu}} x_l\Bigr), \tag{1}$$

where $\mu = 1, \dots, M$ denotes the parity index and $\boldsymbol{\mu}$ denotes the
set of node indices involved in the $\mu$-th parity. Similarly,
$l = 1, \dots, N$ denotes the bit index and $\boldsymbol{l}$ denotes the set of parity
indices linking to the $l$-th bit. $|\boldsymbol{\mu}|$ and $|\boldsymbol{l}|$ denote the degree
of the $\mu$-th parity and the $l$-th bit, respectively. The proportionality
sign denotes the normalization of a probability function, i.e., the
sum of the probability over all possible arguments $x$
should be 1.

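We consider a noisy channel with additive white Gaussian
noise (AWGN); i.e., the probability of the received codes $y$ is
defined as

$$P(y|x) \propto \exp\Bigl(-\sum_{l=1}^{N} \frac{(y_l - x_l)^2}{2\sigma^2}\Bigr), \tag{2}$$

where $\sigma^2$ denotes the variance of the noise. The posterior
probability of the sent code can then be expressed as

$$P(x|y) \propto \prod_{\mu=1}^{M} \Bigl(1 + \prod_{l \in \boldsymbol{\mu}} x_l\Bigr) \prod_{l=1}^{N} \exp\Bigl(\frac{x_l y_l}{\sigma^2}\Bigr). \tag{3}$$
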
To infer the sent code $x$ from $y$, we employ the maximum
posterior marginal (MPM) solution,

$$\hat{x}_l = \operatorname*{argmax}_{x_l} \sum_{x \setminus x_l} P(x|y), \tag{4}$$

which minimizes the bit error rate. On the other hand, the
maximum a posteriori (MAP) solution,

$$\hat{x} = \operatorname*{argmax}_{x} P(x|y), \tag{5}$$

minimizes the block error rate, but is generally difficult to
determine because of the exponential calculation cost.
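
As a sanity check of the two estimators, the following minimal Python sketch enumerates every $x$ for a toy code and computes the MPM and MAP estimates by brute force. It assumes the posterior is proportional to $\prod_\mu (1 + \prod_{l \in \boldsymbol{\mu}} x_l) \prod_l \exp(x_l y_l / \sigma^2)$; the parity sets, the received values, and the function names (`posterior`, `mpm`, `map_estimate`) are hypothetical and chosen only for illustration.

```python
import itertools
import math

def posterior(checks, y, sigma2):
    """Return {x: P(x|y)} over all x in {+1,-1}^N by brute-force enumeration."""
    weights = {}
    for x in itertools.product((+1, -1), repeat=len(y)):
        w = 1.0
        for mu in checks:
            # Parity factor: 2 if the product of the bits is +1, else 0.
            w *= 1 + math.prod(x[l] for l in mu)
        # AWGN likelihood up to a constant factor.
        w *= math.exp(sum(xl * yl for xl, yl in zip(x, y)) / sigma2)
        weights[x] = w
    z = sum(weights.values())
    return {x: w / z for x, w in weights.items()}

def mpm(post, n):
    """Bitwise maximizer of the posterior marginals."""
    est = []
    for l in range(n):
        p_plus = sum(p for x, p in post.items() if x[l] == +1)
        est.append(+1 if p_plus >= 0.5 else -1)
    return tuple(est)

def map_estimate(post):
    """Jointly most probable codeword."""
    return max(post, key=post.get)

# Hypothetical toy example: 4 bits, 2 parity checks.
toy_checks = [(0, 1, 2), (1, 2, 3)]
toy_y = [0.9, -1.1, -0.2, 0.8]
toy_post = posterior(toy_checks, toy_y, sigma2=0.5)
print(mpm(toy_post, 4), map_estimate(toy_post))
```

The enumeration visits $2^N$ configurations, which is why approximate methods are needed for realistic code lengths.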

III. Bethe free energy 

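One purpose of the Bethe free energy approach is to
determine a set of marginal probabilities of a given probability,
which provides the MPM solution here. The Bethe free energy,
$F$, is defined using the Kullback-Leibler (KL) divergence as

$$F = \sum_{\mu=1}^{M} KL\bigl(b_\mu(x_\mu) \,\|\, \psi_\mu(x_\mu)\bigr) + \sum_{l=1}^{N} (1 - |\boldsymbol{l}|)\, KL\bigl(q_l(x_l) \,\|\, \psi_l(x_l)\bigr), \tag{6}$$

where $P(x|y)$ can be represented as

$$P(x|y) \propto \prod_{\mu=1}^{M} \psi_\mu(x_\mu) \prod_{l=1}^{N} \psi_l(x_l)^{1 - |\boldsymbol{l}|} \tag{7}$$

using (normalized) probability functions,

$$\psi_\mu(x_\mu) \propto \Bigl(1 + \prod_{l \in \boldsymbol{\mu}} x_l\Bigr) \prod_{l \in \boldsymbol{\mu}} \exp\Bigl(\frac{x_l y_l}{\sigma^2}\Bigr), \tag{8}$$

$$\psi_l(x_l) \propto \exp\Bigl(\frac{x_l y_l}{\sigma^2}\Bigr), \tag{9}$$

in the current case, where $x_\mu$ denotes the bits involved in the $\mu$-th parity.
The introduced test probability functions
of cliques, $\{b_\mu(x_\mu)\}$, and bits, $\{q_l(x_l)\}$, are required to satisfy
the consistency of the marginal probabilities:

$$\sum_{x_\mu \setminus x_l} b_\mu(x_\mu) = q_l(x_l) \quad (l \in \boldsymbol{\mu}). \tag{10}$$

$\{b_\mu(x_\mu)\}$ and $\{q_l(x_l)\}$ that minimize $F$ are expected to
approximate the marginal probabilities of $P(x|y)$.

Fig. 1. Examples of the parity connection: (a) tree structure and (b) loopy
structure.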

The Bethe free energy approach gives the exact marginal
probabilities when the parity connection has a tree structure
(Fig. 1(a)). In such cases, any probability of $x$ can be expressed
as a product of its marginal probabilities, $\{b_\mu(x_\mu)\}$
and $\{q_l(x_l)\}$, as

$$Q(x) = \prod_{\mu=1}^{M} b_\mu(x_\mu) \prod_{l=1}^{N} q_l(x_l)^{1 - |\boldsymbol{l}|}. \tag{11}$$

Then, the Bethe free energy coincides with the KL divergence
between the test and the true probabilities,

$$F = KL\bigl(Q(x) \,\|\, P(x|y)\bigr), \tag{12}$$

which implies that minimizing the Bethe free energy leads
to the correct probability $Q(x) = P(x|y)$, and, therefore,
the exact MPM solution can be obtained from the resulting
$\{q_l(x_l)\}$. Unfortunately, the Bethe free energy does not represent
the KL divergence for loopy graphs (Fig. 1(b)). Nevertheless,
we here attempt to decode the LDPCC by minimizing $F$ under
the consistency condition (10), expecting that $\{q_l(x_l)\}$
approximates the marginal probabilities well even when
the parity connection does not have a tree structure.
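
The tree-exactness claim can be checked numerically on a toy example. The sketch below assumes the standard Bethe product form $Q(x) = \prod_\mu b_\mu(x_\mu) \prod_l q_l(x_l)^{1-|\boldsymbol{l}|}$, builds the exact posterior by enumeration, extracts its exact marginals, and reports the largest gap between the product of marginals and the posterior. The function name `bethe_product_gap` and the toy parity sets are hypothetical.

```python
import itertools
import math

def bethe_product_gap(checks, y, sigma2):
    """Max |prod_mu b_mu * prod_l q_l^(1-|l|) - P(x|y)| using exact marginals."""
    n = len(y)
    states = list(itertools.product((+1, -1), repeat=n))
    w = []
    for x in states:
        f = math.prod(1 + math.prod(x[l] for l in mu) for mu in checks)
        w.append(f * math.exp(sum(a * b for a, b in zip(x, y)) / sigma2))
    z = sum(w)
    p = [wi / z for wi in w]

    def marginal(idx):
        # Exact marginal of P(x|y) over the bits listed in idx.
        out = {}
        for x, px in zip(states, p):
            key = tuple(x[l] for l in idx)
            out[key] = out.get(key, 0.0) + px
        return out

    b = [marginal(mu) for mu in checks]          # clique marginals b_mu
    q = [marginal((l,)) for l in range(n)]       # bit marginals q_l
    degree = [sum(l in mu for mu in checks) for l in range(n)]

    worst = 0.0
    for x, px in zip(states, p):
        qx = math.prod(b[m][tuple(x[l] for l in mu)] for m, mu in enumerate(checks))
        qx *= math.prod(q[l][(x[l],)] ** (1 - degree[l]) for l in range(n))
        worst = max(worst, abs(qx - px))
    return worst

# Two checks sharing a single bit form a tree: the gap is at rounding level.
tree_gap = bethe_product_gap([(0, 1, 2), (2, 3, 4)], [0.9, -1.1, -0.2, 0.8, 0.3], 0.5)
print(tree_gap)
```

Calling the same function with two checks that share two bits (a loopy connection) generally gives a clearly nonzero gap, which illustrates why the loopy case is only an approximation.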

IV. Lagrange multipliers 

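To minimize the Bethe free energy under the constraint (10),
we introduce Lagrange undetermined multipliers, $\lambda_{\mu l}(x_l)$. The
objective function to minimize is

$$G(\{b_\mu\}, \{q_l\}, \{\lambda_{\mu l}\}) = F + L, \tag{13}$$

where

$$L = \sum_{\mu=1}^{M} \sum_{l \in \boldsymbol{\mu}} \sum_{x_l} \lambda_{\mu l}(x_l) \Bigl( \sum_{x_\mu \setminus x_l} b_\mu(x_\mu) - q_l(x_l) \Bigr), \tag{14}$$

and we solve the following three equations:

$$\frac{\partial G}{\partial b_\mu(x_\mu)} = 0, \tag{15}$$

$$\frac{\partial G}{\partial q_l(x_l)} = 0, \tag{16}$$

$$\frac{\partial G}{\partial \lambda_{\mu l}(x_l)} = 0. \tag{17}$$

Using $x_l \in \{+1, -1\}$ and the normalization conditions of
the probability functions, we can reduce $q_l$ and $\lambda_{\mu l}$ to linear
functions as

$$q_l(x_l) = \frac{1 + x_l \tanh h_l}{2}, \tag{18}$$

$$\lambda_{\mu l}(x_l) = -x_l \Bigl[ h_{\mu l} - \frac{y_l}{\sigma^2} \Bigr]. \tag{19}$$

We also sometimes use $m_l = \langle x_l \rangle_{q_l(x_l)} = \tanh h_l$. Using
these expressions, we can reduce Eqs. (15)-(17) to

$$b_\mu(x_\mu) \propto \Bigl(1 + \prod_{l \in \boldsymbol{\mu}} x_l\Bigr) \prod_{l \in \boldsymbol{\mu}} \exp(x_l h_{\mu l}), \tag{20}$$

$$h_l = \frac{1}{|\boldsymbol{l}| - 1} \Bigl( \sum_{\mu \in \boldsymbol{l}} h_{\mu l} - \frac{y_l}{\sigma^2} \Bigr), \tag{21}$$

$$\sum_{x_\mu \setminus x_l} b_\mu(x_\mu) = q_l(x_l), \tag{22}$$

respectively. From Eqs. (20) and (22), we obtain

$$h_l = h_{\mu l} + \operatorname{atanh} \prod_{l' \in \boldsymbol{\mu} \setminus l} \tanh h_{\mu l'}. \tag{23}$$

Now, we have two types of variable, $\{h_l\}$ and $\{h_{\mu l}\}$, and two
types of simultaneous equation, (21) and (23).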
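
The relation between the single-indexed and double-indexed fields can be verified numerically for one parity. The sketch below assumes the clique belief has the form $b_\mu(x_\mu) \propto (1 + \prod_{l} x_l) \prod_{l} \exp(x_l h_{\mu l})$ and compares the field of its brute-force marginal with the closed form $h_{\mu l} + \operatorname{atanh} \prod_{l' \neq l} \tanh h_{\mu l'}$. The function names and the field values are hypothetical.

```python
import itertools
import math

def brute_force_field(h_mu, l):
    """Field of the marginal on bit l of b(x) ∝ (1 + prod x) * prod exp(x_i h_i)."""
    p = {+1: 0.0, -1: 0.0}
    for x in itertools.product((+1, -1), repeat=len(h_mu)):
        w = (1 + math.prod(x)) * math.exp(sum(xi * hi for xi, hi in zip(x, h_mu)))
        p[x[l]] += w
    # A binary distribution (1 + x tanh h) / 2 has h = 0.5 * log(p(+1) / p(-1)).
    return 0.5 * math.log(p[+1] / p[-1])

def closed_form_field(h_mu, l):
    """h_{mu l} + atanh of the product of tanh over the other bits of the parity."""
    t = math.prod(math.tanh(h) for i, h in enumerate(h_mu) if i != l)
    return h_mu[l] + math.atanh(t)

fields = [0.3, -0.7, 1.2, 0.5]   # hypothetical h_{mu l} for one degree-4 parity
for bit in range(len(fields)):
    print(bit, brute_force_field(fields, bit), closed_form_field(fields, bit))
```

The two columns agree to rounding error, since the marginal of the parity-constrained product is proportional to $e^{x_l h_{\mu l}} (1 + x_l \prod_{l' \neq l} \tanh h_{\mu l'})$.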



V. Belief propagation (BP) 

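BP considers the double-indexed $h$, $\{h_{\mu l}\}$, to be the master
variables. Specifically, from Eqs. (21) and (23), we obtain

$$h_{\mu' l} + \operatorname{atanh} \prod_{l' \in \boldsymbol{\mu}' \setminus l} \tanh h_{\mu' l'} = \frac{1}{|\boldsymbol{l}| - 1} \Bigl( \sum_{\mu'' \in \boldsymbol{l}} h_{\mu'' l} - \frac{y_l}{\sigma^2} \Bigr) \tag{24}$$

for any $\{l, \mu' \in \boldsymbol{l}\}$. BP ingeniously rearranges the left side of
this equation with the average without $\mu$:

$$\frac{1}{|\boldsymbol{l}| - 1} \sum_{\mu' \in \boldsymbol{l} \setminus \mu} \Bigl( h_{\mu' l} + \operatorname{atanh} \prod_{l' \in \boldsymbol{\mu}' \setminus l} \tanh h_{\mu' l'} \Bigr) = \frac{1}{|\boldsymbol{l}| - 1} \Bigl( \sum_{\mu' \in \boldsymbol{l}} h_{\mu' l} - \frac{y_l}{\sigma^2} \Bigr). \tag{25}$$

We then obtain the iterative substitution to converge $\{h_{\mu l}\}$:

$$\text{loop:} \quad h_{\mu l} \leftarrow \frac{y_l}{\sigma^2} + \sum_{\mu' \in \boldsymbol{l} \setminus \mu} \operatorname{atanh} \prod_{l' \in \boldsymbol{\mu}' \setminus l} \tanh h_{\mu' l'}. \tag{26}$$

Once the master variables are determined, we can easily obtain
the slave variables, $\{h_l\}$, by

$$\text{result:} \quad h_l = \frac{y_l}{\sigma^2} + \sum_{\mu' \in \boldsymbol{l}} \operatorname{atanh} \prod_{l' \in \boldsymbol{\mu}' \setminus l} \tanh h_{\mu' l'}. \tag{27}$$

To lower the calculation cost, we check whether the estimated
sent code, $\hat{x}_l = \operatorname{sign} h_l$, satisfies all parities for every
loop of Eq. (26). We stop the iteration loop if we reach any
codeword, or the number of loops reaches an upper limit.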
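
The BP iteration with the codeword-based stopping rule can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the simulations: it assumes the update $h_{\mu l} \leftarrow y_l/\sigma^2 + \sum_{\mu' \in \boldsymbol{l} \setminus \mu} \operatorname{atanh} \prod_{l' \in \boldsymbol{\mu}' \setminus l} \tanh h_{\mu' l'}$, initializes every $h_{\mu l}$ to $y_l/\sigma^2$ (an assumption), and clamps the tanh product to keep atanh finite. The function name `bp_decode` and the toy inputs are hypothetical.

```python
import math

def bp_decode(checks, y, sigma2, max_loops=100):
    """Parallel BP on the fields h_{mu l}; stops at the first codeword."""
    n = len(y)
    prior = [yl / sigma2 for yl in y]                 # channel field y_l / sigma^2
    bit_checks = [[m for m, mu in enumerate(checks) if l in mu] for l in range(n)]
    # Master variables h[(mu, l)], initialised to the channel field (assumption).
    h = {(m, l): prior[l] for m, mu in enumerate(checks) for l in mu}

    def check_terms(fields):
        # atanh of the product of tanh over the other bits of each parity.
        u = {}
        for (m, l) in fields:
            t = math.prod(math.tanh(fields[m, lp]) for lp in checks[m] if lp != l)
            t = max(min(t, 1.0 - 1e-12), -1.0 + 1e-12)  # keep atanh finite
            u[m, l] = math.atanh(t)
        return u

    x_hat = [1 if p >= 0 else -1 for p in prior]
    for _ in range(max_loops):
        u = check_terms(h)
        # Loop update: channel field plus contributions from the other parities.
        h = {(m, l): prior[l] + sum(u[mp, l] for mp in bit_checks[l] if mp != m)
             for (m, l) in h}
        # Slave variables h_l from all parities linked to bit l.
        u = check_terms(h)
        h_bit = [prior[l] + sum(u[m, l] for m in bit_checks[l]) for l in range(n)]
        x_hat = [1 if v >= 0 else -1 for v in h_bit]
        if all(math.prod(x_hat[l] for l in mu) == 1 for mu in checks):
            break                                      # every parity is satisfied
    return x_hat

print(bp_decode([(0, 1, 2)], [1.0, 1.0, -0.2], sigma2=0.5))
```

In the single-parity example the hard decision on the received values violates the parity, and the least reliable bit is the one that gets flipped.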

VI. Concave convex procedure (CCCP) 

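CCCP is a double-loop algorithm utilizing convex optimization.
The convexity of the Bethe free energy is generally
not guaranteed because of the negative coefficient, $1 - |\boldsymbol{l}|$, of
the second term in Eq. (6). So, CCCP employs the following
additional term at every outer loop step $t$:

$$F^t = F + \sum_{l=1}^{N} |\boldsymbol{l}| \, KL\bigl(q_l(x_l) \,\|\, q_l^t(x_l)\bigr) \tag{28}$$

$$\phantom{F^t} = \sum_{\mu=1}^{M} KL(b_\mu \,\|\, \psi_\mu) + \sum_{l=1}^{N} KL(q_l \,\|\, \psi_l) - \sum_{l=1}^{N} |\boldsymbol{l}| \sum_{x_l} q_l(x_l) \log \frac{q_l^t(x_l)}{\psi_l(x_l)}. \tag{29}$$

Equation (29) guarantees the convexity of $F^t(\{b_\mu\}, \{q_l\})$,
because the KL divergence is convex and the third term
is a linear function. Besides, $F$ necessarily decreases if $F^t$
decreases from its value at $q_l = q_l^t$, because the additional term is
non-negative and vanishes there, and the
additional term itself disappears if $\{q_l\}$ converges.

In the inner loop, similar to BP, CCCP considers the double-indexed
$h$, $\{h_{\mu l}\}$, to be the master variables. On the other
hand, in the outer loop, the single-indexed $h$, $\{h_l\}$, are treated as
the master variables. After the convergence of the inner loop,
the outer loop is performed to determine $h_l^{t+1}$.

$$\text{inner loop:} \quad h_{\mu l} \leftarrow \frac{1}{2} \Bigl( \frac{y_l}{\sigma^2} - \sum_{\mu' \in \boldsymbol{l} \setminus \mu} h_{\mu' l} + |\boldsymbol{l}|\, h_l^t - \operatorname{atanh} \prod_{l' \in \boldsymbol{\mu} \setminus l} \tanh h_{\mu l'} \Bigr), \tag{30}$$

$$\text{outer loop:} \quad h_l^{t+1} \leftarrow \frac{y_l}{\sigma^2} + |\boldsymbol{l}|\, h_l^t - \sum_{\mu \in \boldsymbol{l}} h_{\mu l}. \tag{31}$$
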

VII. Bowman-Levin (BL)

BL considers the single-indexed $h$, $\{h_l\}$, to be the master
variables. Specifically, BL first determines $\{h_{\mu l}\}$ from
$\{h_l\}$ using Eq. (23). Solving this requires some iteration,
resulting in an inner loop:

$$\text{inner loop:} \quad h_{\mu l} \leftarrow h_l^t - \operatorname{atanh} \prod_{l' \in \boldsymbol{\mu} \setminus l} \tanh h_{\mu l'}. \tag{32}$$

Because the determination of the slave variables, $\{h_{\mu l}\}$, depends
on the provisional values of the master variables, $\{h_l\}$,
BL also needs a double-loop algorithm. Eq. (21) implies that
the update

$$\text{outer loop:} \quad h_l^{t+1} \leftarrow \frac{1}{|\boldsymbol{l}| - 1} \Bigl( \sum_{\mu \in \boldsymbol{l}} h_{\mu l} - \frac{y_l}{\sigma^2} \Bigr) \tag{33}$$

may be employed for the outer loop using the converged
variables $\{h_{\mu l}\}$.

Eq. (33), however, does not provide satisfactory results,
as it empirically increases the Bethe free energy. This is
because the outer loop (33) can be interpreted as

$$h_l^{t+1} \leftarrow h_l^t + k \frac{\partial G^t}{\partial h_l^t}, \tag{34}$$

where $k = \cosh^2 h_l^t / (|\boldsymbol{l}| - 1)$ is positive, so that the update
ascends rather than descends the gradient. $G^t$ denotes $G$
regarded as a function of only $\{h_l\}$ at outer-loop step $t$.

In order to resolve this difficulty, we use the natural gradient
descent method [9] instead of Eq. (33) as

$$\text{outer loop:} \quad h^{t+1} \leftarrow h^t - k H^{-1} \frac{\partial G^t}{\partial h^t}, \tag{35}$$

where $k$ denotes a small positive step width, and $H$ denotes
the Fisher information matrix defined as

$$H_{l l'} = \sum_{x} Q(x) \frac{\partial \log Q(x)}{\partial h_l} \frac{\partial \log Q(x)}{\partial h_{l'}}, \tag{36}$$

assuming the following approximation:

$$Q(x) \approx \prod_{l=1}^{N} q_l(x_l),$$

under which $H$ is diagonal with $H_{ll} = 1 - m_l^2 = 1 / \cosh^2 h_l$.
We then obtain

$$\text{outer loop:} \quad h_l^{t+1} \leftarrow h_l^t - k \frac{\partial G^t}{\partial m_l^t}, \tag{39}$$

where

$$\frac{\partial G^t}{\partial m_l^t} = \sum_{\mu \in \boldsymbol{l}} h_{\mu l} - \frac{y_l}{\sigma^2} - (|\boldsymbol{l}| - 1)\, h_l^t. \tag{40}$$

Fig. 2. Block (upper three lines) / bit (lower three lines) error rates of
BP, CCCP, and BL algorithms. Configurations were as follows: code length
N = 486, number of parities M = 243, degree of parity $|\boldsymbol{\mu}| = 6$,
degree of bit $|\boldsymbol{l}| = 3$ ((3,6)-regular LDPCC), limit of outer loops: 10000
(the limit of loops in the case of BP), number of inner loops fixed as
6 for CCCP and BL, and step width k = 0.3 in BL. $E_b/N_0$ [dB] is
defined as $10 \log_{10}(1/(2\sigma^2 (N - M)/N))$. The numbers of communications
were $10^3$, $3 \times 10^3$, $10^4$, $3 \times 10^4$, $10^5$, $3 \times 10^5$, and $10^6$ for $E_b/N_0$ =
1.0, 1.5, ..., 4.0, respectively. Each error bar denotes a 99% confidence
interval based on a binomial distribution.
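
The BL double loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the simulations: the inner loop applies a fixed number of parallel sweeps of $h_{\mu l} \leftarrow h_l - \operatorname{atanh} \prod_{l' \in \boldsymbol{\mu} \setminus l} \tanh h_{\mu l'}$, the outer loop applies the natural gradient step $h_l \leftarrow h_l - k (\sum_{\mu \in \boldsymbol{l}} h_{\mu l} - y_l/\sigma^2 - (|\boldsymbol{l}| - 1) h_l)$, all fields start at $y_l/\sigma^2$, and the slave variables are warm-started between outer steps. The initialization, the warm start, the clamping of the tanh product, and the name `bl_decode` are assumptions.

```python
import math

def bl_decode(checks, y, sigma2, k=0.3, n_inner=6, max_outer=1000):
    """Bowman-Levin style double loop with a natural gradient outer step."""
    n = len(y)
    prior = [yl / sigma2 for yl in y]                 # channel field y_l / sigma^2
    bit_checks = [[m for m, mu in enumerate(checks) if l in mu] for l in range(n)]
    h_bit = list(prior)                               # master variables h_l
    h_edge = {(m, l): prior[l] for m, mu in enumerate(checks) for l in mu}  # slaves

    for _outer in range(max_outer):
        x_hat = [1 if v >= 0 else -1 for v in h_bit]
        if all(math.prod(x_hat[l] for l in mu) == 1 for mu in checks):
            break                                     # reached a codeword
        # Inner loop: parallel sweeps with the master variables held fixed.
        for _inner in range(n_inner):
            new_edge = {}
            for (m, l) in h_edge:
                t = math.prod(math.tanh(h_edge[m, lp]) for lp in checks[m] if lp != l)
                t = max(min(t, 1.0 - 1e-12), -1.0 + 1e-12)  # keep atanh finite
                new_edge[m, l] = h_bit[l] - math.atanh(t)
            h_edge = new_edge
        # Outer loop: natural gradient descent step on each h_l.
        for l in range(n):
            grad = (sum(h_edge[m, l] for m in bit_checks[l]) - prior[l]
                    - (len(bit_checks[l]) - 1) * h_bit[l])
            h_bit[l] -= k * grad
    return [1 if v >= 0 else -1 for v in h_bit]

print(bl_decode([(0, 1, 2)], [1.0, 1.0, -0.2], sigma2=0.5))
```

With the defaults `k=0.3` and `n_inner=6`, which mirror the configuration quoted for the simulations, the single-parity example flips its least reliable bit after one outer step. Note that the truncated inner loop need not have converged when the outer step is taken.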

VIII. Validation 

We validated the performance of the BL algorithm by
comparing it with that of BP and CCCP through a simulation
of the Gallager ensemble of short-length regular LDPCC.
As the decoding performance greatly depends on the parity
check matrix, the simulation was done over an LDPCC ensemble;
that is, we regenerated the matrix for every communication
according to Gallager's construction [5]. We assumed that
the decoder knows the true noise variance of the AWGN
channel, $\sigma^2$. In the simulation, BL performed somewhat better
than both BP and CCCP.
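
For readers who want to reproduce an ensemble of this kind, the following sketch draws one regular parity-check structure in the style of Gallager's construction: the first band of rows covers consecutive groups of bits, and each further band is a random column permutation of the first. It is an illustrative assumption about the construction details; the function name `gallager_parity_sets` and its arguments are hypothetical.

```python
import random

def gallager_parity_sets(n, col_weight, row_weight, rng=random):
    """Return M = n * col_weight / row_weight parity sets (tuples of bit indices)."""
    if n % row_weight != 0:
        raise ValueError("n must be a multiple of row_weight")
    rows_per_band = n // row_weight
    checks = []
    for band in range(col_weight):
        perm = list(range(n))
        if band > 0:
            rng.shuffle(perm)          # random column permutation of the first band
        for r in range(rows_per_band):
            cols = perm[r * row_weight:(r + 1) * row_weight]
            checks.append(tuple(sorted(cols)))
    return checks

# (3,6)-regular structure with N = 486 bits, giving M = 243 parities.
parity_sets = gallager_parity_sets(486, 3, 6, random.Random(0))
print(len(parity_sets))
```

Each bit appears exactly once per band, so every bit has degree 3 and every parity has degree 6. Short loops between bands are not excluded by this sketch.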

Figure 2 shows the block and bit error rates of each
algorithm over various signal-to-noise ratios ($E_b/N_0$). BL
performed better than BP and CCCP, especially in the area
where $E_b/N_0$ was around 2 dB. The error floor appeared in
the area where $E_b/N_0$ was greater than about 2.5 dB. This
error floor probably occurred due to the short loops of the parity
check matrix.
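
The signal-to-noise ratio on the horizontal axis maps to the channel noise variance through the definition $E_b/N_0\,[\mathrm{dB}] = 10 \log_{10}(1/(2\sigma^2 (N - M)/N))$. A small helper, with hypothetical function names, makes the conversion explicit:

```python
import math

def noise_variance(ebn0_db, n, m):
    """Noise variance sigma^2 for a given Eb/N0 in dB and code rate (n - m) / n."""
    rate = (n - m) / n
    return 1.0 / (2.0 * rate * 10.0 ** (ebn0_db / 10.0))

def ebn0_db(sigma2, n, m):
    """Inverse mapping: Eb/N0 in dB from the noise variance."""
    rate = (n - m) / n
    return 10.0 * math.log10(1.0 / (2.0 * sigma2 * rate))

# Rate-1/2 code with N = 486 and M = 243.
for snr in (1.0, 2.0, 4.0):
    print(snr, noise_variance(snr, 486, 243))
```

For the rate-1/2 configuration, 0 dB corresponds to $\sigma^2 = 1$, and the variance shrinks by a factor of 10 for every 10 dB.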

Figure 3 shows the time (outer-loop step) evolution of the
rate for both the not-correctly-decoded and wrongly-decoded
cases. In the early outer-loop steps, BP tended to reach the
correct codeword first, and then CCCP and BL followed. In
the later steps, BL continuously improved the block error
rate, while CCCP had little effect after about the 500-th
step. The effect of BP was intermediate. The rates for the
wrongly-decoded case were almost the same among the three
algorithms. These results suggest that the BL algorithm will
outperform BP and CCCP if we can afford a high calculation
cost, for example, 1000 outer-loop steps.

Fig. 3. Time evolution (outer loop steps) of the not-correctly-decoded
rates (upper three lines) and wrongly-decoded rates (lower three lines) for BP,
CCCP, and BL algorithms. The data correspond to the case of $E_b/N_0 = 2$ dB
in Fig. 2.

The calculation cost is roughly proportional to the number 
of inner loops (we regard that of BP to be 1). So, if we set the 
number of inner loops as 6 for CCCP and BL, the cost ratio of 
BP, CCCP, and BL will be about 1 : 6 : 6. If we consider the 
average number of outer loops, the difference could become 
larger (e.g., 1 : 10 : 12), but this depends on the upper limit 
on the number of outer loops. 

Parallelization is also an important factor regarding calcula- 
tion cost. Briefly, parallelization of the BP loop is possible. It is 
also possible for the outer loops of CCCP and BL, but not for 
the inner loop of CCCP. On the other hand, it is indispensable 
for the inner loop of BL to achieve fast convergence. 

Parameter optimization of the three algorithms is a real 
problem. In the case of BP, we have to determine only the 
upper limit of the outer loops. For CCCP, we also have to 
determine the number of inner loops. For BL, in addition to 
the CCCP parameters, we have to determine the step width of 
the natural gradient descent. Empirically, the choice of
the step width appears rather robust, since the simulated BL
performance generally exceeded that of the other algorithms
(Fig. 2) even though all simulations shared a common step width
(i.e., k = 0.3).



The optimization of the parity check matrix is also a 
problem, especially for short-length LDPCC. We will further 
investigate the dependence of these algorithms on the matrix 
in our future work. 

IX. Conclusion 

The method we have proposed minimizes the Bethe free 
energy based on the Bowman-Levin (BL) approximation. The 
BL algorithm combined with the natural gradient descent 
method successfully converges. We have compared our BL 
algorithm to the belief propagation (BP) and concave convex 
procedure (CCCP) algorithms with respect to the decoding 
problem of the Gallager ensemble of short-length regular low- 
density parity check codes (LDPCC) over an additive white 
Gaussian noise (AWGN) channel. Simulation showed that the 
BL algorithm outperformed the BP and CCCP algorithms, 
although the BL calculation cost was greater. This suggests 
that the BL algorithm can be successfully applied to other 
problems to which BP or CCCP have already been applied. 